UZR, 2000-2003 (December 21, 2003)
Posted 12:46 p.m.,
December 24, 2003
(#9) -
Jay Jaffe
(homepage)
Please stop me before I misunderstand something here. At the top of Tango's comparison he writes:
UZR: MGL's UZR runs
Pinto: (Actual Outs - Expected Outs)*.8
I gather from the comparison that both of these are thus being expressed in runs, but I don't understand the .8. Are you saying that every ball which we expect a fielder to have gotten to that he didn't -- regardless of position -- is worth 0.8 runs?
Without some further explanation, that seems like a wildly inaccurate way of measuring the impact of a missed out. Please enlighten me as to what I'm missing here.
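For concreteness, the Pinto line quoted above reduces to the arithmetic below. This is only a sketch of the formula as written in Tango's comparison: the function name and the out totals in the usage note are made up for illustration, and the flat 0.8 runs per out is exactly the assumption I'm questioning.

```python
# Flat conversion factor taken from the quoted line "(Actual Outs - Expected Outs)*.8".
# Whether one value fits every position is the open question, so it is a parameter.
RUNS_PER_OUT = 0.8


def pinto_runs(actual_outs, expected_outs, runs_per_out=RUNS_PER_OUT):
    """Convert outs above or below expectation into runs with a flat factor.

    A positive result means the fielder recorded more outs than expected.
    """
    return (actual_outs - expected_outs) * runs_per_out
```

Under that reading, a hypothetical fielder with 310 actual outs against 300 expected would be credited with 10 * 0.8 = 8 runs, whatever position he plays.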
Futility Infielder - 2003 DIPS (January 27, 2004)
Posted 12:15 p.m.,
January 27, 2004
(#8) -
Jay Jaffe
(homepage)
Hi guys,
I just wanted to note that since this article's posting here I've added a significant chunk which I think will interest you, so you may want to re-read it (I had planned this as a follow-up but was able to put it together more easily than I'd suspected, so I've squeezed it in). For DIPS 1.x, Voros published season-to-season correlation data between various components ($SO, $BB, $HR, $H) and for dERA & ERA, but to the best of my knowledge did not do the same for DIPS 2.0. I have done so for the two seasons of data that I hold, and the results are pretty consistent with his findings. The formatting won't hold here, but if you want to skim it, it's here:
category/DIPS 1.x/DIPS 2.0
Years/98-99/02-03
Baseline IP/162/162
Number of P/60/56
$BB/.681/.673
$SO/.792/.801
$HR/.505/.372
$H/.153/.106
Years/93-99/02-03
Baseline IP/100/100
Number of P/503/96
ERA to ERA/.407/.378
ERA to dERA/.521/.524
Futility Infielder - 2003 DIPS (January 27, 2004)
Posted 12:19 p.m.,
January 27, 2004
(#9) -
Jay Jaffe
(homepage)
Oops, those last two lines should read:
ERA to next ERA/.407/.378
dERA to next ERA/.521/.524
Futility Infielder - 2003 DIPS (January 27, 2004)
Posted 3:27 p.m.,
January 27, 2004
(#17) -
Jay Jaffe
(homepage)
Andrew,
I think it's far more likely that even a saber-savvy front office such as Toronto's uses more information than just one year of DIPS 2.0 to make their assessments. To assume otherwise is to oversimplify the matter greatly.
They may have a DIPS 3.0, which leaves DIPS 2.0 in the dust but which they own entirely and have no obligation to share with us.
They may think DIPS is hogwash -- even among statheads, not everyone subscribes to the theory, after all -- and have some other means of projecting pitchers based on several years of statistics, weighted in some fashion.
They may have a proprietary system which spits out PECOTA-like projections that take into account physical characteristics as well (FWIW, PECOTA doesn't like Lilly or Hentgen much; weighted mean forecasts for '04 are 5.07 and 5.06 ERA respectively, Batista 4.19).
They may have an in-house stathead jumping up and down yelling, "Sign these guys!" who is overruled by a GM who has financial realities and a broader long-term picture of the organization in mind.
They may have scouts who say that Hentgen's picked up 10 MPH on his heater in the past six months and a pitching coach who knows Ted Lilly's best friend and so thinks he can connect with the enigmatic lefty.
Even those possibilities are oversimplifications. And it just may be that evaluating a front office based on three transactions is an example of letting a small sample size influence your conclusions.
For what it's worth, I'd say Batista was a very good signing because his performance has been steadily improving, Lilly a decent one because he's looked like he's very close to putting it all together at times (especially the end of last year), and Hentgen a kinda lousy one because he has very little upside other than munching about 150 innings at league average.
Futility Infielder - 2003 DIPS (January 27, 2004)
Posted 4:08 p.m.,
January 27, 2004
(#20) -
Jay Jaffe
(homepage)
tango & studes -- I revised my page to include two links to FIP, one at each of your sites. If and when I get a chance, I'll insert a paragraph incorporating a brief description and the data above. Thanks...
Futility Infielder - 2003 DIPS (January 27, 2004)
Posted 10:28 p.m.,
January 27, 2004
(#24) -
Jay Jaffe
(homepage)
Re the Dodgers:
140 HR were hit by the Dodgers and their opponents in LA.
111 HR were hit by the Dodgers and their opponents away from LA, by far the lowest total in baseball (Montreal at 128 was next).
That latter figure has as much to do with the Dodgers' pathetic offense as their spectacular pitching, both of which were extreme even away from the distortion field of Chavez Ravine. No team came close to hitting fewer road HR than their 56. No team came close to allowing fewer road HR than their 55.
My hunch is that it's a single-season anomaly related to the team's personnel rather than their ballpark -- Dodger Stadium's HR factor has been between .98 and 1.04 from 1999 to 2002, and I don't recall anybody blaming "the wind blowing out" for extraordinarily high run totals in those 2-1 slugfests in LA.
Weather or not (hehehe), this imbalance shows up as Park HR factor, and the Dodger pitchers' numbers are adjusted accordingly.
Futility Infielder - 2003 DIPS (January 27, 2004)
Posted 5:05 p.m.,
February 1, 2004
(#40) -
Jay Jaffe
(homepage)
I don't claim the comparisons to be anything but what they are: back-to-back season correlations for pitchers who reached a certain threshold of innings in both of the two years for which I held data, set alongside the findings Voros published using his older system. To the best of my knowledge those are the first published correlations of DIPS 2.0 data, and they appear to support the conclusions Voros reached with DIPS 1.x using the same inning thresholds.
"[T]he most important single result in sabermetrics over the last five years"? Thanks, but I think that's a bit overstated. I am glad that the work I put into this is appreciated, though. Even many "DIPS skeptics" (including an MLB front-office person) have written to tell me that they're glad to see the data because it has some uses to them.
I will concede that RossCW has a point in #37 in that I haven't made any comparison of 98-99 DIPS 1.x to 02-03 DIPS 1.x and 98-99 DIPS 2.0 to 02-03 DIPS 2.0. I am but one man with limited capabilities, and while I've made a good faith effort to do the task I set out to do with as much accuracy as possible, I don't have the time or level of interest to build a spreadsheet that would run the older formula over a new set of data and vice versa. The methodologies for both are out there, though, so if somebody else wants to do so...
And while I see Ross' point about a non-random sample, I'm not sure how meaningful a comparison of, say, pitchers who pitched 100 innings in Season 1 and at least 1 inning in Season 2 would be -- the "sample size" issues seem obvious when it comes to small amounts of playing time. I think the general consensus here would be to use a baseline that has some meaningful level of playing time.
For what it's worth, I've rerun the 2002-2003 comparisons at a lower threshold, 50 innings in each season. The results are not as strong as at higher thresholds and there's a weird "hump" by which many of the 100 inning correlations are higher than either the 162 or the 50 inning ones, but dERA still correlates better than ERA with the following season's ERA.
category/DIPS 2.0
Years/02-03
Baseline IP/162/100/50
Number of P/56/96/214
$BB/.673/.733/.554
$SO/.801/.824/.767
$HR/.372/.272/.246
$H/.106/.132/.080
Years/02-03
Baseline IP/162/100/50
Number of P/56/96/214
ERA to next ERA/.288/.378/.325
dERA to next ERA/.513/.524/.432
Alas, I don't have XBH data in my sheets so that I could compare the correlations of component ERA against dERA and ERA.
Ross, you make one point there about "team average hits" etc., which is not accurate -- DIPS 1 used team averages, but DIPS 2 uses league averages.
Futility Infielder - 2003 DIPS (January 27, 2004)
Posted 1:00 a.m.,
February 2, 2004
(#42) -
Jay Jaffe
(homepage)
"BTW - are your baseline numbers inclusive - i.e. does the over 50 IP include pitchers who pitched over 100?"
Yes, all of those groups (50, 100, 162) include everybody who met or exceeded the number of innings pitched in both seasons.
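For anyone who wants to rerun this on their own data, the inclusive-threshold comparison described above can be sketched roughly as follows. This is an illustration and not my actual spreadsheet: the data layout (a dict per season keyed by pitcher, each holding an "IP" entry plus the stats of interest) and the function names are assumptions made up for the example, and the correlation is an ordinary Pearson r.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / sqrt(sxx * syy)


def year_to_year_r(season1, season2, stat1, stat2, min_ip):
    """Correlate stat1 in season 1 with stat2 in season 2.

    Only pitchers who met or exceeded min_ip in BOTH seasons are kept, so a
    lower threshold always includes everyone counted at a higher one.
    season1 and season2 are hypothetical dicts: pitcher -> {"IP": ..., stat: ...}.
    Returns (number of pitchers, correlation).
    """
    qualified = [
        p for p in season1
        if p in season2
        and season1[p]["IP"] >= min_ip
        and season2[p]["IP"] >= min_ip
    ]
    xs = [season1[p][stat1] for p in qualified]
    ys = [season2[p][stat2] for p in qualified]
    return len(qualified), pearson(xs, ys)
```

With season dicts loaded, something like year_to_year_r(s2002, s2003, "dERA", "ERA", 100) would correspond to the "dERA to next ERA" line at the 100 IP baseline, and swapping in 50 or 162 gives the other columns.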